The effect of β-cyclocitral treatment on the carotenoid content of transgenic Marsh grapefruit (Citrus paradisi Macf.) suspension-cultured cells
Zheng, Xiongjie, Zhu, Kaijie, Ye, Junli, Price, Elliott J., Deng, Xiuxin, Fraser, Paul D. (2020): The effect of β-cyclocitral treatment on the carotenoid content of transgenic Marsh grapefruit (Citrus paradisi Macf.) suspension-cultured cells. Phytochemistry 180 (112509): 1-8, DOI: 10.1016/j.phytochem.2020.112509, URL: http://dx.doi.org/10.1016/j.phytochem.2020.112509
EmotionPrompt: Leveraging Psychology for Large Language Models Enhancement via Emotional Stimulus
Large language models (LLMs) have achieved significant performance in many
fields such as reasoning, language understanding, and math problem-solving, and
are regarded as a crucial step to artificial general intelligence (AGI).
However, the sensitivity of LLMs to prompts remains a major bottleneck for
their daily adoption. In this paper, we take inspiration from psychology and
propose EmotionPrompt to explore emotional intelligence to enhance the
performance of LLMs. EmotionPrompt operates on a remarkably straightforward
principle: the incorporation of emotional stimulus into prompts. Experimental
results demonstrate that our EmotionPrompt, using the same single prompt
templates, significantly outperforms the original zero-shot prompt and
Zero-shot-CoT on 8 tasks with diverse models: ChatGPT, Vicuna-13b, Bloom, and
T5. Further, EmotionPrompt was observed to improve both truthfulness and
informativeness. We believe that EmotionPrompt heralds a novel avenue for
exploring interdisciplinary knowledge for human-LLM interaction. Comment: Work in progress; 9 pages
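The core mechanism described in the abstract — appending an emotional stimulus to an otherwise unchanged prompt — can be sketched in a few lines. The stimulus wording and function name below are illustrative; the paper's actual templates may differ:

```python
# Minimal sketch of the EmotionPrompt idea: augment a base prompt with an
# emotional stimulus. The stimulus text here is an illustrative example, not
# necessarily one of the paper's exact templates.
EMOTIONAL_STIMULI = [
    "This is very important to my career.",
    "You'd better be sure.",
]

def emotion_prompt(base_prompt: str, stimulus_id: int = 0) -> str:
    """Return the base prompt with an emotional stimulus appended."""
    return f"{base_prompt} {EMOTIONAL_STIMULI[stimulus_id]}"

print(emotion_prompt("Determine whether the review is positive or negative."))
```

Because the stimulus is a plain-text suffix, the same template can be reused across all tasks and models, which matches the abstract's claim of using a single prompt template.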
CompeteAI: Understanding the Competition Behaviors in Large Language Model-based Agents
Large language models (LLMs) have been widely used as agents to complete
different tasks, such as personal assistance or event planning. While most work
has focused on cooperation and collaboration between agents, little work
explores competition, another important mechanism that fosters the development
of society and economy. In this paper, we seek to examine the competition
behaviors in LLM-based agents. We first propose a general framework to study
the competition between agents. Then, we implement a practical competitive
environment using GPT-4 to simulate a virtual town with two types of agents,
including restaurant agents and customer agents. Specifically, restaurant
agents compete with each other to attract more customers, where the competition
fosters them to transform, such as cultivating new operating strategies. The
results of our experiments reveal several interesting findings, ranging from
social learning to the Matthew effect, which align well with existing sociological
and economic theories. We believe that competition between agents deserves
further investigation to help us understand society better. The code will be
released soon. Comment: Technical report; 21 pages
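The competitive dynamic the abstract describes — restaurants vying for customers, with losers adapting their strategy — can be caricatured numerically. The sketch below is purely illustrative (the paper's agents are GPT-4-driven, not numeric scores); names and the update rule are my own:

```python
import random

# Toy model of the competitive setup: each restaurant agent has a "quality"
# score standing in for its strategy. Customers pick the best restaurant,
# and the loser imitates part of the winner's advantage (social learning).
def run_round(quality, n_customers=10, rng=random):
    wins = {name: 0 for name in quality}
    for _ in range(n_customers):
        best = max(quality.values())
        contenders = [n for n, q in quality.items() if q == best]
        wins[rng.choice(contenders)] += 1  # ties broken at random
    winner = max(wins, key=wins.get)
    loser = min(wins, key=wins.get)
    if wins[winner] > wins[loser]:
        # Losing restaurant closes half the quality gap to the winner.
        quality[loser] += 0.5 * (quality[winner] - quality[loser])
    return wins
```

Even this toy update rule reproduces the flavor of the reported findings: an initial advantage compounds (a Matthew effect) until imitation closes the gap.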
Clustering-Structure Representative Sampling from Graph Streams
Most existing sampling algorithms on graphs (i.e., network-structured data) focus on sampling from memory-resident static graphs and assume the entire graph is always available. However, the graphs encountered in modern applications are often too large and/or too dynamic to be processed with limited memory. Furthermore, existing sampling techniques are inadequate for preserving the inherent clustering structure, which is an essential property of complex networks. To tackle these problems, we propose a new sampling algorithm that dynamically maintains a representative sample and is capable of retaining clustering structure in graph streams at any time. The performance of the proposed algorithm is evaluated through empirical experiments using real-world networks. The experimental results show that our proposed CPIES algorithm can produce clustering-structure-representative samples and outperforms current online sampling algorithms.
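For context on the problem setting, the classic baseline for sampling from a stream in bounded memory is reservoir sampling over the edge stream. The sketch below is that standard baseline, not the paper's CPIES algorithm (which additionally preserves clustering structure):

```python
import random

# Classic reservoir sampling over an edge stream: keeps a uniform random
# sample of k edges using O(k) memory, regardless of stream length.
# (Baseline only -- CPIES goes further by retaining clustering structure.)
def reservoir_sample_edges(edge_stream, k, rng=random):
    sample = []
    for i, edge in enumerate(edge_stream):
        if i < k:
            sample.append(edge)           # fill the reservoir first
        else:
            j = rng.randint(0, i)         # keep edge i with probability k/(i+1)
            if j < k:
                sample[j] = edge
    return sample
```

Uniform edge samples like this are exactly what the abstract criticizes: they bound memory but make no attempt to keep clusters intact, which motivates a clustering-aware replacement policy.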
PromptBench: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts
The increasing reliance on Large Language Models (LLMs) across academia and
industry necessitates a comprehensive understanding of their robustness to
prompts. In response to this vital need, we introduce PromptBench, a robustness
benchmark designed to measure LLMs' resilience to adversarial prompts. This
study uses a plethora of adversarial textual attacks targeting prompts across
multiple levels: character, word, sentence, and semantic. These prompts are
then employed in diverse tasks, such as sentiment analysis, natural language
inference, reading comprehension, machine translation, and math
problem-solving. Our study generates 4,032 adversarial prompts, meticulously
evaluated over 8 tasks and 13 datasets, with 567,084 test samples in total. Our
findings demonstrate that contemporary LLMs are vulnerable to adversarial
prompts. Furthermore, we present comprehensive analysis to understand the
mystery behind prompt robustness and its transferability. We then offer
insightful robustness analysis and pragmatic recommendations for prompt
composition, beneficial to both researchers and everyday users. We make our
code, prompts, and methodologies to generate adversarial prompts publicly
accessible, thereby enabling and encouraging collaborative exploration in this
pivotal field: https://github.com/microsoft/promptbench. Comment: Technical report; 23 pages
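To make the attack levels concrete, a character-level perturbation of a prompt might look like the generic sketch below. This is an illustrative random character swap, not one of PromptBench's actual attack implementations (which are targeted, e.g. DeepWordBug- or TextBugger-style):

```python
import random

# Illustrative character-level adversarial perturbation: randomly replace
# a few letters in a prompt while leaving spacing and punctuation intact.
# (Generic sketch; PromptBench's real attacks are targeted, not random.)
def char_perturb(prompt: str, n_edits: int = 1, seed: int = 0) -> str:
    rng = random.Random(seed)
    chars = list(prompt)
    letter_positions = [i for i, c in enumerate(chars) if c.isalpha()]
    for _ in range(min(n_edits, len(letter_positions))):
        i = rng.choice(letter_positions)
        chars[i] = rng.choice("abcdefghijklmnopqrstuvwxyz")
    return "".join(chars)
```

A robust model should give the same answer for the clean and perturbed prompts; the benchmark's finding is that contemporary LLMs often do not.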
A Survey on Evaluation of Large Language Models
Large language models (LLMs) are gaining increasing popularity in both
academia and industry, owing to their unprecedented performance in various
applications. As LLMs continue to play a vital role in both research and daily
use, their evaluation becomes increasingly critical, not only at the task
level, but also at the society level for better understanding of their
potential risks. Over the past years, significant efforts have been made to
examine LLMs from various perspectives. This paper presents a comprehensive
review of these evaluation methods for LLMs, focusing on three key dimensions:
what to evaluate, where to evaluate, and how to evaluate. Firstly, we provide
an overview from the perspective of evaluation tasks, encompassing general
natural language processing tasks, reasoning, medical usage, ethics,
education, natural and social sciences, agent applications, and other areas.
Secondly, we answer the 'where' and 'how' questions by diving into the
evaluation methods and benchmarks, which serve as crucial components in
assessing the performance of LLMs. Then, we summarize the success and failure cases
of LLMs on different tasks. Finally, we shed light on several future challenges
that lie ahead in LLM evaluation. Our aim is to offer invaluable insights to
researchers in the realm of LLM evaluation, thereby aiding the development of
more proficient LLMs. Our key point is that evaluation should be treated as an
essential discipline to better assist the development of LLMs. We consistently
maintain the related open-source materials at:
https://github.com/MLGroupJLU/LLM-eval-survey. Comment: 23 pages
CYP2C19 genotype and platelet aggregation test-guided dual antiplatelet therapy after off-pump coronary artery bypass grafting: A retrospective cohort study
Background: Dual antiplatelet therapy (DAPT) is recommended in patients undergoing off-pump coronary artery bypass graft surgery (OPCAB). Clopidogrel is less effective among patients with loss-of-function (LoF) CYP2C19 alleles, while ticagrelor acts directly on the P2Y12 receptor. Whether a CYP2C19 genotype plus platelet aggregation test (PAgT)-guided DAPT after CABG could improve clinical outcomes remains uncertain.
Materials and methods: From August 2019 to December 2020, 1,134 consecutive patients who underwent OPCAB received DAPT for 1 year after surgery in Ruijin Hospital, Shanghai Jiao Tong University School of Medicine. According to the treatment they actually received in the real world, 382 (33.7%) of them received traditional DAPT (aspirin 100 mg qd + clopidogrel 75 mg qd), regardless of CYP2C19 genotype and PAgT response. The other 752 (66.3%) patients received individual DAPT based on CYP2C19 genotype and PAgT: aspirin 100 mg qd + clopidogrel 75 mg qd if CYP2C19 indicated an extensive metabolizer, or a moderate metabolizer with normal PAgT response; aspirin 100 mg qd + ticagrelor 90 mg bid if CYP2C19 indicated a poor metabolizer, or a moderate metabolizer with no or low PAgT response. One-year follow-up was achieved for all patients. The primary outcome was major adverse cardiovascular events (MACE), a composite of cardiovascular death, myocardial infarction, and stroke. The safety outcome was thrombolysis in myocardial infarction (TIMI) criteria major bleeding.
Results: Compared with the traditional DAPT group, the risk of MACE in the individual DAPT group was significantly lower (5.5 vs. 9.2%, HR 0.583; 95% CI, 0.371–0.915; P = 0.019), mainly due to the decreased risk of MI (1.7 vs. 4.2%, HR 0.407; 95% CI, 0.196–0.846; P = 0.016). The risk of TIMI major bleeding events was similar between the two groups (5.3 vs. 6.0%, RR 0.883; 95% CI, 0.537–1.453; P = 0.626).
Conclusion: For patients who underwent OPCAB, individual DAPT (a CYP2C19 genotype plus PAgT-guided strategy) was associated with a lower risk of MACE and a similar risk of major bleeding.
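The study's individualized assignment rule is a simple decision table, which can be written out directly. Drug names and doses are taken from the abstract; the metabolizer labels and function name below are my own encoding:

```python
# Decision rule for individual DAPT as described in the abstract.
# Labels "extensive" / "moderate" / "poor" encode CYP2C19 metabolizer status;
# pagt_normal encodes whether the platelet aggregation test response is normal.
def choose_dapt(cyp2c19: str, pagt_normal: bool) -> str:
    if cyp2c19 == "extensive" or (cyp2c19 == "moderate" and pagt_normal):
        return "aspirin 100 mg qd + clopidogrel 75 mg qd"
    # Poor metabolizers, or moderate metabolizers with no/low PAgT response:
    return "aspirin 100 mg qd + ticagrelor 90 mg bid"
```

The rule reserves ticagrelor (which does not depend on CYP2C19 metabolism) for exactly the patients in whom clopidogrel is expected to be less effective.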
Cognitive impairment in diffuse axonal injury patients with favorable outcome
Background and purpose: Traumatic brain injury (TBI), especially severe TBI, is often followed by persistent cognitive sequelae, including decision-making difficulties, reduced neural processing speed, and memory deficits. Diffuse axonal injury (DAI) is classified as one of the severe types of TBI. Some DAI patients are marginalized from social life due to cognitive impairment, even if they are rated as having a favorable outcome. The purpose of this study was to elucidate the specific type and severity of cognitive impairment in DAI patients with favorable outcome.
Methods: The neurocognition of 46 DAI patients with favorable outcome was evaluated with the Chinese version of the Montreal Cognitive Assessment Basic (MoCA-BC), and the differences in the domains of cognitive impairment caused by different grades of DAI were analyzed, after data conversion of the scores of the nine MoCA-BC cognitive domains, by Pearson correlation analysis.
Results: Among the 46 DAI patients with favorable outcome, eight had normal cognitive function (MoCA-BC ≥ 26) and 38 had cognitive impairment (MoCA-BC < 26). The MoCA-BC scores were positively correlated with pupillary light reflex (r = 0.361, p = 0.014), admission Glasgow Coma Scale (GCS) (r = 0.402, p = 0.006), and years of education (r = 0.581, p < 0.001). Return of consciousness (r = −0.753, p < 0.001), Marshall CT (r = −0.328, p = 0.026), age (r = −0.654, p < 0.001), and DAI grade (r = −0.403, p = 0.006) were negatively correlated with the MoCA-BC scores. In patients with DAI grade 1, the actually deducted scores (Ads) of memory (r = 0.838, p < 0.001), abstraction (r = 0.843, p < 0.001), and calculation (r = 0.782, p < 0.001) were most strongly related to the Ads of MoCA-BC. Among patients with DAI grade 2, the Ads of all nine cognitive domains were correlated with the Ads of MoCA-BC. However, in DAI grade 3 patients, the highest correlations with the Ads of MoCA-BC were the Ads of memory (r = 0.904, p < 0.001), calculation (r = 0.799, p = 0.006), orientation (r = 0.801, p = 0.005), and executive function (r = 0.869, p = 0.001).
Conclusion: DAI patients with favorable outcome may still be plagued by cognitive impairment, and different grades of DAI affect different cognitive domains.
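All of the r values above are Pearson correlation coefficients. For readers unfamiliar with the statistic, a minimal pure-Python implementation is shown below (the clinical data themselves are not reproduced here):

```python
import math

# Pearson correlation coefficient between two equal-length sequences:
# covariance of x and y divided by the product of their standard deviations.
def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Values near +1 or −1 (such as the r = 0.904 for memory in grade 3 patients) indicate a strong linear relationship; the study's p-values additionally test whether each correlation is statistically distinguishable from zero.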